Using YOLO v7 to Detect Kidney in Magnetic Resonance Imaging



Introduction
This study explores the use of the latest You Only Look Once (YOLO V7) object detection method to enhance kidney detection in medical imaging by training and testing a modified YOLO V7 on medical image formats.

Methods
The study included 878 patients with various subtypes of renal cell carcinoma (RCC) and 206 patients with normal kidneys. A total of 5657 MRI scans from 1084 patients were retrieved. A total of 326 patients with 1034 tumors were recruited from a retrospectively maintained database, and bounding boxes were drawn around their tumors. A primary model was trained on 80% of annotated cases, with 20% held out for testing (primary test set). The best primary model was then used to identify tumors in the remaining 861 patients, and bounding box coordinates were generated on their scans using the model. Ten benchmark training sets were created with the generated coordinates on non-segmented patients. The final model was used to predict the kidney in the primary test set. We reported the positive predictive value (PPV), sensitivity, and mean average precision (mAP).

Results
The primary training set showed an average PPV of 0.94 ± 0.01, sensitivity of 0.87 ± 0.04, and mAP of 0.91 ± 0.02. The best primary model yielded a PPV of 0.97, sensitivity of 0.92, and mAP of 0.95. The final model demonstrated an average PPV of 0.95 ± 0.03, sensitivity of 0.98 ± 0.004, and mAP of 0.95 ± 0.01.

Conclusion
Using a semi-supervised approach with a medical image library, we developed a high-performing model for kidney detection. Further external validation is required to assess the model's generalizability.

Introduction:
With advances in medical imaging technology, the detection of anatomical structures has become more accurate, timely, and efficient. Cross-sectional imaging modalities such as computed tomography (CT) and magnetic resonance imaging (MRI) provide valuable information regarding diagnosis, classification, and treatment options for kidney cancer (1). In 2021, kidney cancer accounted for 4-5% of all new cancer diagnoses in the United States, and the incidence has been rising for the past three decades (2,3). The increasing incidental detection of renal masses is most likely due to the increased use of cross-sectional imaging (4). In addition, incidentally identified renal masses, often indolent or benign lesions, have led to further use of abdominal imaging for surveillance (5).
Integrating deep learning algorithms such as convolutional neural networks (CNNs) has improved the annotation and segmentation of medical images. Although CNNs offer great benefit, they require significant time and effort to create manually annotated training images. To reduce this initial workload, semi-supervised learning (SSL) methods allow a small set of manually annotated images to be applied to an existing large set of unlabeled data (6).
You Only Look Once version 7 (YOLOv7) is a fast and accurate deep learning algorithm for object detection that can be used with SSL. Several studies have used deep learning algorithms such as YOLOv3 for object detection of pathologies such as retinal breaks and colonic polyps (7-10). In this study, we developed a kidney detection model using YOLOv7 with a semi-supervised learning approach and evaluated its accuracy in detecting kidney parenchyma.

Patient cohort
This retrospective study examined MRIs from 1084 patients with renal masses who underwent partial or radical nephrectomy or active surveillance between January 2003 and June 2022. Two hundred and six patients did not have a renal mass. IRB authorization was obtained for patient recruitment, and signed informed consent was acquired.

YOLO V7 modification
We used the YOLO V7 code, courtesy of Chien-Yao Wang et al., and modified it to read, write, and predict objects in '.nii', '.nii.gz', and '.dcm' formats. The modified code is stored in our lab's GitHub repository (11).
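As a minimal sketch of how input files in these formats might be recognized before loading, the snippet below maps a file name to a format family. The helper name `medical_format` and the `SUPPORTED` table are hypothetical and are not taken from the repository; the only detail drawn from the text is the list of three extensions.

```python
from pathlib import Path

# Hypothetical lookup table: only the three extensions come from the text.
SUPPORTED = {".nii": "nifti", ".nii.gz": "nifti", ".dcm": "dicom"}


def medical_format(path):
    """Return 'nifti' or 'dicom' for a supported file name, else None."""
    name = Path(path).name.lower()
    # Try the longest suffix first so '.nii.gz' is matched as a whole
    # rather than being treated as a plain '.gz' file.
    for suffix in sorted(SUPPORTED, key=len, reverse=True):
        if name.endswith(suffix):
            return SUPPORTED[suffix]
    return None
```

Matching on the full file name rather than on `Path.suffix` matters here because `Path("scan.nii.gz").suffix` is only `".gz"`. The returned label could then be used to choose a reader, for example a NIfTI or DICOM library, which this sketch deliberately leaves out.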

Model production
Step 1. Primary model: We trained ten benchmarks of segmented patient scans using YOLO V7. For training and testing, 80% of patients were randomly assigned to the training group and 20% to the test group. After the models were created, their positive predictive value (PPV), sensitivity, and mean average precision (mAP) were evaluated, and the best model was chosen.
Step 2. Detect the object in the dataset: Detection on all additional non-segmented images was conducted using the best-performing model chosen in step 1. The corresponding detected kidney coordinates were stored as text files.
Step 3. Train on all scans: Another model was trained using the bounding boxes defined in step 2. A total of 861 patients were used in this step to train the model. The step 1 test sets were reused in this step, and any scans linked to those test patients were removed from the training set. For training in step 3, weights from the primary model were used to reduce possible false-positive training. Model performance was recorded and reported separately as the final results. The learning rate was set at 0.001, and the batch size was 120. Categorical cross-entropy was minimized using the Adam optimizer, with momentum set to 0.9 and weight decay at 0.00005 (12). Model weight values were computed for each epoch (one iteration through the entire dataset). After 100 epochs, training was terminated because cross-entropy and accuracy showed no further progress (the list of hyperparameters used is provided in the supplementary material).
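The bookkeeping behind the three steps above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the function names, the `patient_id` field, and the fixed seed are hypothetical, and the label layout assumes the conventional YOLO text format (class index followed by normalized center coordinates, width, and height), which the text does not spell out.

```python
import random


def split_patients(patient_ids, test_fraction=0.2, seed=0):
    """Randomly split unique patient IDs into (train, test) lists."""
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


def drop_test_patients(scans, test_ids):
    """Keep only scans whose patient is not in the held-out test set."""
    held_out = set(test_ids)
    return [s for s in scans if s["patient_id"] not in held_out]


def yolo_label_line(class_id, box, image_w, image_h):
    """Format a pixel box (x_min, y_min, x_max, y_max) as one YOLO label line."""
    x_min, y_min, x_max, y_max = box
    xc = (x_min + x_max) / 2 / image_w  # normalized box center, x
    yc = (y_min + y_max) / 2 / image_h  # normalized box center, y
    w = (x_max - x_min) / image_w       # normalized box width
    h = (y_max - y_min) / image_h       # normalized box height
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"
```

Splitting by patient rather than by scan, and then filtering the step 3 training scans with `drop_test_patients`, is what keeps every scan of a test patient out of training.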

Demographic
Out of the total 1084 patients with renal cell carcinoma included in the study, 57% were male. The median age of the population was 57 (mean = 55.3 ± 14.7). While all patients had at least one renal mass, 79.6% of patients had available pathology for histologic concordance (Table 2).

Model's performance
The best primary model had a PPV of 0.97, a sensitivity of 0.92, and an mAP of 0.95; we used this model to detect the kidney in the remaining unsegmented scans. The PPV-sensitivity diagram for the primary model is shown in Figure 2. The best final performance was for the same benchmark, with a PPV of 0.99, sensitivity of 0.99, and mAP of 0.98 (Table 3). Although we did not segment the lesions or the kidney, detecting the kidney on MRI is itself challenging, which underscores the performance of our approach. Our method successfully detected renal parenchyma with a final mAP of 0.98, demonstrating its accuracy in finding the kidney parenchyma. This result highlights the potential of our model in handling the complexities of MRI-based kidney detection and its adaptability to various medical imaging formats. Furthermore, our model's robust performance contributes to the growing body of evidence supporting the use of advanced machine learning techniques in medical imaging and renal health.
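For readers less familiar with detection metrics, the sketch below shows one common way PPV, sensitivity, and F1 can be derived from predicted and ground-truth boxes. It is not the evaluation code used in the study. The function names are hypothetical, the 0.5 intersection-over-union threshold and the greedy one-to-one matching are assumptions, predictions are assumed to arrive sorted by descending confidence, and mAP (which averages precision across the full PPV-sensitivity curve) is not computed.

```python
def iou(a, b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def detection_metrics(predictions, truths, iou_threshold=0.5):
    """Greedily match predictions to truths; return (ppv, sensitivity, f1)."""
    unmatched = list(truths)
    tp = 0
    for pred in predictions:
        # Pick the still-unmatched ground-truth box that overlaps most.
        best = max(unmatched, key=lambda t: iou(pred, t), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each truth can be matched only once
            tp += 1
    fp = len(predictions) - tp  # predictions with no matching truth
    fn = len(unmatched)         # truths that no prediction matched
    ppv = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity) if ppv + sensitivity else 0.0
    return ppv, sensitivity, f1
```

In this formulation PPV penalizes boxes drawn where there is no kidney, while sensitivity penalizes kidneys that were missed, which is why the two are reported together.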
Other work on kidney parenchymal segmentation has been conducted on normal kidneys. Taro Langner et al. examined the feasibility of automatically segmenting the renal parenchyma using UK Biobank MRIs from approximately 40,000 healthy volunteers and reported a Dice similarity coefficient of 0.96 (13). In comparison, our model achieved a final detection performance of 0.98 for mAP, indicating high accuracy in detecting kidney parenchyma. The difference in performance could be attributed to several factors, such as our use of YOLO V7 for detection, a diverse dataset from different scanners, and a focus on detecting kidney parenchyma rather than segmenting it.
A direct comparison of the two studies might not be entirely appropriate because of differences in task, methodology, and dataset. Nevertheless, both demonstrate promising results in kidney segmentation and detection, paving the way for improved diagnosis and treatment of renal conditions.
An important limitation of our study is the lack of external validation of the model. Using many images (more than 3 million) and several types of scanners may make it easier to anticipate the outcome of external validation. However, external validation is still needed to confidently verify the model's performance. Another limitation is our model's focus on detecting renal parenchyma without addressing other important kidney structures, such as the renal pelvis, calyces, and vasculature. This limitation may affect the model's applicability in assessing kidney conditions that require the evaluation of these structures. Furthermore, our study did not evaluate the integration of the model into clinical workflows or its impact on clinical decision-making. Further research is needed to understand how the model can best be implemented and used by healthcare professionals to improve patient care.

Conclusion:
This study's encouraging findings suggest that the developed model might be used for kidney parenchyma detection. In addition, the study demonstrated that the modified YOLOv7 code may be used directly on medical imaging formats to construct YOLOv7 models. This investigation must also be conducted on scans from external institutions to establish its validity.

A total of 5657 MRI scans from different time points were included. Five different types of scanners were used to capture images. Additional technical information on the MRI scanners is available in supplementary table 1. The following sequences were performed on the patients: multiplanar T2, pre-contrast T1, and post-contrast T1. Post-contrast images were acquired in the corticomedullary (20-second), nephrographic (70-second), and excretory (3-minute) phases after contrast material administration. Out of the total patients in the study, kidneys were segmented in a subset of 223 patients on the excretory phase using ITK-SNAP (version 3.8) by two postdoctoral radiology research fellows. An abdominal radiologist with MRI fellowship training (AAM, 14 years of experience) confirmed all segmentations. All segmentations were converted to bounding boxes using preprocessing codes [https://github.com/Translationalimaginglab/YOLOV7-RCC].

Image preprocessing
All 3D DICOM images were converted to NIfTI format, and each slice was separated and saved. Each 3-minute post-contrast image was identified and down-sampled to 1 mm x 1 mm x 1 mm axial slices. Using Rician normalization, all MRIs were normalized. Examples of these MRIs and ground truth segmentations can be seen in Figure 1.
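The conversion of a segmentation to a bounding box mentioned above can be illustrated with a short sketch. It is not the preprocessing code from the repository: the function name `mask_to_bbox` is hypothetical, the mask is assumed to be a single 2D slice given as rows of 0/1 values, and the box uses an exclusive upper bound.

```python
def mask_to_bbox(mask):
    """Return (x_min, y_min, x_max, y_max) around nonzero pixels, or None.

    `mask` is a 2D list of rows; x indexes columns and y indexes rows.
    The maximum coordinates are exclusive (one past the last foreground pixel).
    """
    # Indices of rows that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # empty slice: no kidney labeled, so no box
    # Column indices of every foreground pixel across all rows.
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return (min(cols), rows[0], max(cols) + 1, rows[-1] + 1)
```

This sketch draws one box around all foreground pixels in the slice. Where both kidneys appear in the same slice, separate boxes would require splitting the mask into connected components first, which is left out here.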

Figure 1. Kidney ground truth and related performance images. A. The ground truth is manually produced. B. Detection performance using the primary model.

Figure 2. Study flow diagram. In this diagram you can see the process of the study step by step.

Figure 3. Primary best model performance diagrams. A. F1 score of 0.95 at 0.45 confidence was calculated. B. PPV was 0.97. C. PPV-sensitivity curve showed that mAP@0.5 was 0.95. D. In this diagram sensitivity was shown to be 0.97.

Figure 4. Final best model performance diagrams. A. F1 score of 0.98 at 0.65 confidence was calculated. B. PPV was 0.99. C. PPV-sensitivity curve showed that mAP@0.5 was 0.98. D. In this diagram sensitivity was shown to be 0.99.